kernel score


Uncertainty Quantification for Regression: A Unified Framework based on kernel scores

Bülte, Christopher, Sale, Yusuf, Kutyniok, Gitta, Hüllermeier, Eyke

arXiv.org Artificial Intelligence

Regression tasks, notably in safety-critical domains, require proper uncertainty quantification, yet the literature remains largely classification-focused. In this light, we introduce a family of measures for total, aleatoric, and epistemic uncertainty based on proper scoring rules, with a particular emphasis on kernel scores. The framework unifies several well-known measures and provides a principled recipe for designing new ones whose behavior, such as tail sensitivity, robustness, and out-of-distribution responsiveness, is governed by the choice of kernel. We prove explicit correspondences between kernel-score characteristics and downstream behavior, yielding concrete design guidelines for task-specific measures. Extensive experiments demonstrate that these measures are effective in downstream tasks and reveal clear trade-offs among instantiations, including robustness and out-of-distribution detection performance.
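
As a concrete illustration, here is a minimal sketch of how such sample-based measures can be instantiated in Python. It assumes a Gaussian kernel and an ensemble of predictive distributions represented by samples; the decomposition shown (negative expected kernel similarity as an entropy surrogate, with the epistemic term equal to the spread of the members' kernel mean embeddings) is one plausible instantiation, not necessarily the paper's exact estimators.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    # Pairwise Gaussian (RBF) kernel matrix between 1-D sample arrays.
    sq_dists = (x[:, None] - y[None, :]) ** 2
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def expected_kernel(x, y, bandwidth=1.0):
    # Monte Carlo estimate of E[k(X, Y)] from samples of X and Y.
    return gaussian_kernel(x, y, bandwidth).mean()

def kernel_uncertainty(ensemble_samples, bandwidth=1.0):
    """Total, aleatoric, and epistemic uncertainty for one input.

    ensemble_samples: (M, N) array with N predictive samples from each
    of M ensemble members. Negative expected kernel similarity serves
    as an entropy surrogate: more dispersed predictions score higher.
    """
    pooled = ensemble_samples.ravel()  # samples from the ensemble mixture
    total = -expected_kernel(pooled, pooled, bandwidth)
    aleatoric = -np.mean(
        [expected_kernel(s, s, bandwidth) for s in ensemble_samples]
    )
    # Epistemic = total - aleatoric: the average squared RKHS distance
    # of the members' mean embeddings from their barycentre, so it
    # vanishes exactly when all members agree.
    return total, aleatoric, total - aleatoric

# Example: five members with shifted means -> nonzero epistemic term.
rng = np.random.default_rng(0)
member_means = rng.normal(0.0, 1.0, size=(5, 1))
samples = rng.normal(member_means, 1.0, size=(5, 200))
print(kernel_uncertainty(samples))
```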


A Novel Framework for Uncertainty Quantification via Proper Scores for Classification and Beyond

Gruber, Sebastian G.

arXiv.org Machine Learning

In this PhD thesis, we propose a novel framework for uncertainty quantification in machine learning, which is based on proper scores. Uncertainty quantification is an important cornerstone for trustworthy and reliable machine learning applications in practice. Usually, approaches to uncertainty quantification are problem-specific, and solutions and insights cannot be readily transferred from one task to another. Proper scores are loss functions minimized by predicting the target distribution. Due to their very general definition, proper scores apply to regression, classification, and even generative modeling tasks. We contribute several theoretical results that connect epistemic uncertainty, aleatoric uncertainty, and model calibration with proper scores, resulting in a general and widely applicable framework. We achieve this by introducing a general bias-variance decomposition for strictly proper scores via functional Bregman divergences. Specifically, we use the kernel score, a kernel-based proper score, for evaluating sample-based generative models in various domains, such as image, audio, and natural language generation. This includes a novel approach for uncertainty estimation of large language models, which outperforms state-of-the-art baselines. Further, we generalize the calibration-sharpness decomposition beyond classification, which motivates the definition of proper calibration errors. We then introduce a novel estimator for proper calibration errors in classification, and a novel risk-based approach to compare different estimators for squared calibration errors. Lastly, we offer a decomposition of the kernel spherical score, another kernel-based proper score, allowing a more fine-grained and interpretable evaluation of generative image models.
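
Because the kernel score only needs samples, its empirical form is short enough to state directly. Below is a minimal sketch, assuming the negatively oriented convention S_k(P, y) = 1/2 E[k(X, X')] - E[k(X, y)] and a Gaussian kernel; the bandwidth and function names are illustrative rather than taken from the thesis.

```python
import numpy as np

def rbf(x, y, bandwidth=1.0):
    # Gaussian kernel matrix between two batches of d-dimensional points.
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def kernel_score(generated, observation, bandwidth=1.0):
    """Empirical kernel score S_k(P, y) = 0.5*E[k(X,X')] - E[k(X,y)].

    generated: (n, d) samples from the model P; observation: (d,) the
    observed outcome y. Lower is better, and for a characteristic
    kernel the expected score is minimized when P is the data law.
    The E[k(X,X')] term averages off-diagonal pairs only, which makes
    the estimator unbiased.
    """
    K = rbf(generated, generated, bandwidth)
    n = len(generated)
    e_kxx = (K.sum() - np.trace(K)) / (n * (n - 1))
    e_kxy = rbf(generated, observation[None, :], bandwidth).mean()
    return 0.5 * e_kxx - e_kxy
```

Averaging this score over a test set of observations yields a sample-based evaluation of a generative model, the use case the thesis develops for image, audio, and text generation.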


Proper scoring rules for estimation and forecast evaluation

Waghmare, Kartik, Ziegel, Johanna

arXiv.org Machine Learning

In recent years, proper scoring rules have emerged as a powerful general approach for estimating probability distributions. In addition to significantly expanding the range of modeling techniques that can be applied in practice, this has also substantially broadened the conceptual understanding of estimation methods. Originally, proper scoring rules were conceived in meteorology as summary statistics for describing the performance of probabilistic forecasts (Murphy and Winkler, 1984), but they also play an important role in economics as tools for belief elicitation (Schotter and Trevino, 2014). A probabilistic forecast is a probability distribution over the space of the possible outcomes of the future event that is stated by the forecaster. The simplest and most popular case of probabilistic forecasts arises when the outcome is binary, so the probabilistic forecast reduces to issuing a predictive probability of success. Brier (1950) was the first to consider the problem of devising a scoring rule which could not be "played" by a dishonest forecasting agent. He introduced the quadratic scoring rule and showed that it incentivizes a forecasting agent to state his most accurate probability estimate when faced with uncertainty.
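
Brier's incentive property is easy to verify numerically: if the forecaster believes Y ~ Bernoulli(q), the expected quadratic score E[(p - Y)^2] = q(p - 1)^2 + (1 - q)p^2 is minimized exactly at the stated probability p = q. A quick sketch:

```python
import numpy as np

q = 0.3                               # the forecaster's true belief
p_grid = np.linspace(0.0, 1.0, 101)   # candidate reported probabilities

# Expected Brier score under Y ~ Bernoulli(q):
# E[(p - Y)^2] = q*(p - 1)^2 + (1 - q)*p^2
expected_score = q * (p_grid - 1.0) ** 2 + (1.0 - q) * p_grid**2

best_p = p_grid[np.argmin(expected_score)]
print(best_p)  # 0.3 -- honest reporting minimizes the expected score
```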


Efficient pooling of predictions via kernel embeddings

Allen, Sam, Ginsbourger, David, Ziegel, Johanna

arXiv.org Machine Learning

Probabilistic predictions are probability distributions over the set of possible outcomes. Such predictions quantify the uncertainty in the outcome, making them essential for effective decision making. By combining multiple predictions, the information sources used to generate the predictions are pooled, often resulting in a more informative forecast. Probabilistic predictions are typically combined by linearly pooling the individual predictive distributions; this encompasses several ensemble learning techniques, for example. The weights assigned to each prediction can be estimated based on their past performance, allowing more accurate predictions to receive a higher weight. This can be achieved by finding the weights that optimise a proper scoring rule over some training data. By embedding predictions into a Reproducing Kernel Hilbert Space (RKHS), we illustrate that estimating the linear pool weights that optimise kernel-based scoring rules is a convex quadratic optimisation problem. This permits an efficient implementation of the linear pool when optimally combining predictions on arbitrary outcome domains. This result also holds for other combination strategies, and we additionally study a flexible generalisation of the linear pool that overcomes some of its theoretical limitations, whilst allowing an efficient implementation within the RKHS framework. These approaches are compared in an application to operational wind speed forecasts, where this generalisation is found to offer substantial improvements upon the traditional linear pool.
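
To make the quadratic structure concrete, here is a minimal sketch under the following assumptions: each component forecast is represented by samples, the score is a kernel score with a Gaussian kernel, and the simplex-constrained QP is handed to scipy's SLSQP solver. All function names are illustrative; the paper's RKHS formulation is more general.

```python
import numpy as np
from scipy.optimize import minimize

def rbf(x, y, bandwidth=1.0):
    # Pairwise Gaussian kernel matrix between 1-D sample arrays.
    sq_dists = (x[:, None] - y[None, :]) ** 2
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def optimal_pool_weights(component_samples, outcomes, bandwidth=1.0):
    """Linear-pool weights minimizing an empirical kernel score.

    component_samples: list of M arrays, samples from each forecast P_i.
    outcomes: (T,) observed outcomes over the training period.
    The pooled score is 0.5 * w'Qw - b'w, where
      Q[i, j] = E[k(X_i, X_j)]  (inner products of mean embeddings),
      b[i]    = average over t of E[k(X_i, y_t)],
    so the problem is a convex QP over the probability simplex.
    """
    Q = np.array([[rbf(si, sj, bandwidth).mean() for sj in component_samples]
                  for si in component_samples])
    b = np.array([rbf(si, outcomes, bandwidth).mean()
                  for si in component_samples])
    M = len(component_samples)
    res = minimize(
        lambda w: 0.5 * w @ Q @ w - b @ w,
        x0=np.full(M, 1.0 / M),
        bounds=[(0.0, 1.0)] * M,
        constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},
        method="SLSQP",
    )
    return res.x
```

Because Q is a Gram matrix of mean embeddings it is positive semi-definite, so the objective is convex and the solver returns a global optimum.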


A Bias-Variance-Covariance Decomposition of Kernel Scores for Generative Models

Gruber, Sebastian G., Buettner, Florian

arXiv.org Machine Learning

Generative models, like large language models, are becoming increasingly relevant in our daily lives, yet a theoretical framework to assess their generalization behavior and uncertainty does not exist. In particular, the problem of uncertainty estimation is commonly solved in an ad-hoc, task-dependent manner. For example, natural language approaches cannot be transferred to image generation. In this paper we introduce the first bias-variance-covariance decomposition for kernel scores and their associated entropy. We propose unbiased and consistent estimators for each quantity which only require generated samples but not the underlying model itself. As an application, we offer a generalization evaluation of diffusion models and discover how mode collapse of minority groups is a phenomenon contrary to overfitting. Further, we demonstrate that variance and predictive kernel entropy are viable measures of uncertainty for image, audio, and language generation. Specifically, our approach for uncertainty estimation is more predictive of performance on the CoQA and TriviaQA question answering datasets than existing baselines and can also be applied to closed-source models.
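
A minimal sketch of the sample-only flavour of such estimators, assuming model outputs have already been embedded as vectors (for language, e.g., via some sentence embedding; that choice is an assumption, not part of the paper) and using a Gaussian kernel:

```python
import numpy as np

def rbf(x, y, bandwidth=1.0):
    # Gaussian kernel matrix between two batches of embedded outputs.
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def kernel_entropy_and_variance(samples, bandwidth=1.0):
    """Uncertainty measures computed from generated samples alone.

    samples: (n, d) i.i.d. generations for one input, embedded as
    vectors. Off-diagonal averaging gives an unbiased estimate of
    E[k(X, X')], so no access to the underlying model is needed.
      entropy  ~ -0.5 * E[k(X, X')]          (more dispersed = higher)
      variance ~ E[k(X, X)] - E[k(X, X')]    (RKHS spread around the
                                              mean embedding)
    """
    K = rbf(samples, samples, bandwidth)
    n = len(samples)
    e_k_offdiag = (K.sum() - np.trace(K)) / (n * (n - 1))
    entropy = -0.5 * e_k_offdiag
    variance = np.trace(K) / n - e_k_offdiag
    return entropy, variance
```

For a language model one would sample several answers to the same prompt, embed them, and flag prompts with high variance as uncertain, in the spirit of the CoQA/TriviaQA experiments.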


Non-adversarial training of Neural SDEs with signature kernel scores

Issa, Zacharia, Horvath, Blanka, Lemercier, Maud, Salvi, Cristopher

arXiv.org Artificial Intelligence

Neural SDEs are continuous-time generative models for sequential data. State-of-the-art performance for irregular time series generation has previously been obtained by training these models adversarially as GANs. However, as is typical for GAN architectures, training is notoriously unstable, often suffers from mode collapse, and requires specialised techniques such as weight clipping and gradient penalty to mitigate these issues. In this paper, we introduce a novel class of scoring rules on pathspace based on signature kernels and use them as objectives for training Neural SDEs non-adversarially. By showing strict properness of such kernel scores and consistency of the corresponding estimators, we provide existence and uniqueness guarantees for the minimiser. With this formulation, evaluating the generator-discriminator pair amounts to solving a system of linear path-dependent PDEs, which allows for memory-efficient adjoint-based backpropagation. Moreover, because the proposed kernel scores are well-defined for paths with values in infinite-dimensional spaces of functions, our framework can be easily extended to generate spatiotemporal data. Our procedure permits conditioning on a rich variety of market conditions and significantly outperforms alternative ways of training Neural SDEs on a variety of tasks, including the simulation of rough volatility models, the conditional probabilistic forecasting of real-world forex pairs where the conditioning variable is an observed past trajectory, and the mesh-free generation of limit order book dynamics.
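
To convey the shape of the training signal, here is a minimal sketch that substitutes depth-truncated signature features for the paper's PDE-computed signature kernel (the actual method solves linear path-dependent PDEs and backpropagates through them; truncated signatures are a coarser but self-contained approximation). Depth, loss form, and all names are illustrative.

```python
import numpy as np

def segment_signature(delta, depth):
    # Signature of a linear segment with increment delta:
    # level k equals the k-fold tensor power of delta divided by k!.
    levels, term = [np.array(1.0)], np.array(1.0)
    for k in range(1, depth + 1):
        term = np.multiply.outer(term, delta) / k
        levels.append(term)
    return levels

def chen_product(S1, S2, depth):
    # Chen's identity: the signature of a concatenation of two paths
    # is the (truncated) tensor product of their signatures.
    return [sum(np.multiply.outer(S1[i], S2[k - i]) for i in range(k + 1))
            for k in range(depth + 1)]

def path_signature(path, depth=3):
    # Exact truncated signature of a piecewise-linear path of shape (T, d).
    sig = segment_signature(path[1] - path[0], depth)
    for i in range(1, len(path) - 1):
        seg = segment_signature(path[i + 1] - path[i], depth)
        sig = chen_product(sig, seg, depth)
    return np.concatenate([lvl.ravel() for lvl in sig])

def signature_score_loss(generated_paths, observed_path, depth=3):
    """Kernel score 0.5*E[k(X,X')] - E[k(X,y)] with the linear
    truncated-signature kernel k(x, y) = <sig(x), sig(y)>."""
    feats = np.stack([path_signature(p, depth) for p in generated_paths])
    target = path_signature(observed_path, depth)
    G = feats @ feats.T
    n = len(feats)
    e_kxx = (G.sum() - np.trace(G)) / (n * (n - 1))  # unbiased, off-diagonal
    e_kxy = (feats @ target).mean()
    return 0.5 * e_kxx - e_kxy
```

In practice one would compute these features in an autodifferentiation framework (e.g. PyTorch) so that the loss gradient can flow back into the Neural SDE generator's parameters.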